Skip to content

feat: marf squash engine#7060

Open
francesco-stacks wants to merge 43 commits into
stacks-network:developfrom
francesco-stacks:feat/marf-squash-engine
Open

feat: marf squash engine#7060
francesco-stacks wants to merge 43 commits into
stacks-network:developfrom
francesco-stacks:feat/marf-squash-engine

Conversation

@francesco-stacks
Copy link
Copy Markdown
Contributor

Description

Builds on top of #7057. Will be removed from Draft once that one will be merged

  • Add MARF::squash_to_path(): offline squash pipeline that DFS-collects nodes into a disk-backed NodeStore, remaps pointers, recomputes hashes, and streams a single shared trie blob
  • Make get_root_hash_at(), get_block_height_miner_tip(), and get_trie_root_ancestor_hashes_bytes() squash-aware so blocks inside the squashed range return correct per-height values from SQL side-tables

Applicable issues

  • fixes #

Additional info (benefits, drawbacks, caveats)

Checklist

  • Test coverage for new or modified code paths
  • For new Clarity features or consensus changes, add property tests (see docs/property-testing.md)
  • Changelog fragment(s) or "no changelog" label added (see changelog.d/README.md)
  • Required documentation changes (e.g., rpc/openapi.yaml for RPC endpoints, event-dispatcher.md for new events)
  • New clarity functions have corresponding PR in clarity-benchmarking repo

@francesco-stacks francesco-stacks self-assigned this Mar 30, 2026
@coveralls
Copy link
Copy Markdown

coveralls commented Mar 31, 2026

Coverage Report for CI Build 25864253919

Coverage increased (+0.06%) to 85.775%

Details

  • Coverage increased (+0.06%) from the base build.
  • Patch coverage: 256 uncovered changes across 9 files (2535 of 2791 lines covered, 90.83%).
  • 5293 coverage regressions across 102 files.

Uncovered Changes

File Changed Covered %
stackslib/src/chainstate/stacks/index/squash.rs 567 486 85.71%
stackslib/src/chainstate/stacks/index/test/squash.rs 1447 1395 96.41%
stackslib/src/chainstate/stacks/index/squash/stream.rs 175 134 76.57%
stackslib/src/chainstate/stacks/index/squash/node_store.rs 262 237 90.46%
stackslib/src/chainstate/stacks/index/trie.rs 41 23 56.1%
stackslib/src/chainstate/stacks/index/trie_sql.rs 59 45 76.27%
stackslib/src/chainstate/stacks/index/marf.rs 53 40 75.47%
stackslib/src/chainstate/stacks/index/mod.rs 8 0 0.0%
stackslib/src/chainstate/stacks/index/storage.rs 58 54 93.1%

Coverage Regressions

5293 previously-covered lines in 102 files lost coverage.

Top 10 Files by Coverage Loss Lines Losing Coverage Coverage
stackslib/src/config/mod.rs 358 69.04%
stackslib/src/chainstate/stacks/miner.rs 234 83.58%
stackslib/src/chainstate/stacks/index/storage.rs 229 82.08%
stackslib/src/net/inv/epoch2x.rs 226 79.21%
stackslib/src/chainstate/stacks/db/transactions.rs 203 97.13%
stackslib/src/net/chat.rs 201 93.01%
stackslib/src/chainstate/stacks/db/mod.rs 196 86.23%
stackslib/src/core/mempool.rs 170 86.87%
stackslib/src/chainstate/stacks/index/node.rs 161 87.28%
stacks-node/src/nakamoto_node/miner.rs 150 87.27%

Coverage Stats

Coverage Status
Relevant Lines: 222385
Covered Lines: 190751
Line Coverage: 85.78%
Coverage Strength: 18719152.56 hits per line

💛 - Coveralls

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements the MARF “squash engine” to create disk-backed squashed snapshots and makes key MARF/trie lookup paths squash-aware so callers can get correct per-height root hashes and block heights when reading within the squashed range.

Changes:

  • Add MARF::squash_to_path() offline squashing pipeline (DFS node collection into a disk-backed NodeStore, pointer remap, hash recompute, and streaming a single trie blob).
  • Update trie/MARF lookup logic to consult squash SQL side-tables for correct block heights and per-height root hashes in squashed ranges.
  • Add extensive unit/integration tests covering squashing, extension behavior, pointer encoding, and patch/backpointer edge cases.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
stackslib/src/chainstate/stacks/index/trie.rs Make ancestor-hash collection squash-aware by reading block heights/root hashes from squash SQL side-tables where appropriate.
stackslib/src/chainstate/stacks/index/storage.rs Make get_root_hash_at() consult squash metadata so historical blocks inside the squash range return the correct archival root hash.
stackslib/src/chainstate/stacks/index/squash.rs Add the new squash engine implementation (NodeStore, remap/hash recompute, blob streaming, metadata persistence).
stackslib/src/chainstate/stacks/index/mod.rs Export the new squash module.
stackslib/src/chainstate/stacks/index/marf.rs Re-export squash constants/stats and make get_block_height_miner_tip() squash-aware via side-table lookups.
stackslib/src/chainstate/stacks/index/test/squash.rs Add comprehensive squash/extend regression tests and targeted unit tests for disk-backed mechanisms.
stackslib/src/chainstate/stacks/index/test/node.rs Add tests for inline back-block payload in compressed pointers and back_block preservation behavior.
stackslib/src/chainstate/stacks/index/test/node_patch.rs Add regression test ensuring patch application preserves inline back_block payload identity.
stackslib/src/chainstate/stacks/index/test/mod.rs Register the new squash test module.
stackslib/src/chainstate/stacks/index/test/marf.rs Rename/expose the MARF setup helper for reuse by squash tests; adjust one callback-propagation test to use smaller fixtures.
changelog.d/marf-squash-engine.added Changelog entry for squash engine + squash-aware lookups.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread stackslib/src/chainstate/stacks/index/squash.rs Outdated
Comment thread stackslib/src/chainstate/stacks/index/squash.rs Outdated
Comment thread stackslib/src/chainstate/stacks/index/squash.rs Outdated
Comment thread stackslib/src/chainstate/stacks/index/storage.rs Outdated
Comment thread stackslib/src/chainstate/stacks/index/marf.rs Outdated
Copy link
Copy Markdown
Contributor

@federico-stacks federico-stacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a first pass and left some minors/nits around.

Comment thread stackslib/src/chainstate/stacks/index/test/squash.rs Outdated
Comment thread stackslib/src/chainstate/stacks/index/squash.rs Outdated
Comment thread stackslib/src/chainstate/stacks/index/test/squash.rs Outdated
Comment thread stackslib/src/chainstate/stacks/index/test/squash.rs Outdated
Comment thread stackslib/src/chainstate/stacks/index/test/squash.rs Outdated
Comment thread stackslib/src/chainstate/stacks/index/test/squash.rs Outdated
Comment thread stackslib/src/chainstate/stacks/index/test/squash.rs Outdated
Comment thread stackslib/src/chainstate/stacks/index/test/squash.rs Outdated
Comment thread stackslib/src/chainstate/stacks/index/squash.rs Outdated
Comment thread stackslib/src/chainstate/stacks/index/squash.rs

/// Rewrite inline child pointers from in-memory node indices to blob-local
/// byte offsets. Backpointers and empty pointers are left untouched.
pub fn update_inline_child_ptrs(ptrs: &mut [TriePtr], file_offsets: &[u64]) -> Result<(), Error> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not to start bikeshedding, but ... any good idea for a name for this function that makes it more obvious what's happening? "Update" is pretty generic and meaningless, especially considering how consequential the data modification is that it performs.

(I know you didn't actually add this function here, you just moved it)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed in e820a0e

// height H for every block in the squashed range. Use the
// side-table when available.
if storage.is_squashed() {
if let Some(h) = trie_sql::read_squash_block_height(storage.sqlite_conn(), block_hash)?
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that in the vast majority of cases, this function is going to becalled with the chain tip or very close to it.

If that's true, wouldn't it be better if we tried the MARF first, and only fall back to SQL later?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is a valid point. In normal operations that will be the case. we'll mostly read from the squash during clarity lookups, block replay, and the RPC endpoints in general. Anyway I did the change , it was a bit more involved than I expected, because the sql lookup first removed a couple of edge cases 70a1aa1

Comment on lines +2514 to +2515
if let Some(h) = trie_sql::read_squash_block_height(self.sqlite_conn(), tip)? {
return trie_sql::read_squash_archival_marf_root_hash(self.sqlite_conn(), h)?
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes two SQL queries that are fulfilled from the same row -- first SELECT height WHERE hash, then SELECT root_hash WHERE height.

Feels like that should be a single SELECT root_hash WHERE hash?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added read_squashed_block_root_hash_by_hash in 20739d632

// we want to find the block at a given _height_. but how to do so?
// use the data stored already in the MARF.
let cur_block_height =
MARF::get_block_height_miner_tip(storage, &cur_block_header, &cur_block_header)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You updated get_block_height_miner_tip to handle squashed MARFs. Given that, why is this change here necessary?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, yeah redundant db15da7


let root_ptr = storage.root_trieptr();
let ancestor_hash = storage.read_node_hash_bytes(&root_ptr)?;
let ancestor_height = cur_block_height - (1u32 << log_depth);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(note to self: have a closer look at this change)

francesco-stacks added a commit to francesco-stacks/stacks-core that referenced this pull request May 13, 2026
Comment thread stackslib/src/chainstate/stacks/index/squash.rs Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants